Smart Paper Notes

DDColor: Towards Photo-Realistic and Semantic-Aware Image Colorization via Dual Decoders

Xiaoyang Kang , Tao Yang , Wenqi Ouyang , Peiran Ren , Lingzhi Li , Xuansong Xie

Category: Computer Vision

2022-12-22

Automatic image colorization is a particularly challenging problem. Because the problem is highly ill-posed and subject to multi-modal uncertainty, directly training a deep neural network usually leads to incorrect semantic colors and low color richness. Existing transformer-based methods can deliver better results but depend heavily on hand-crafted, dataset-level empirical distribution priors. In this work, we propose DDColor, a new end-to-end method with dual decoders for image colorization. More specifically, we design a multi-scale image decoder and a transformer-based color decoder. The former restores the spatial resolution of the image, while the latter establishes the correlation between semantic representations and color queries via cross-attention. The two decoders work together to learn semantic-aware color embeddings by leveraging the multi-scale visual features. With the help of these two decoders, our method succeeds in producing semantically consistent and visually plausible colorization results without any additional priors. In addition, a simple but effective colorfulness loss is introduced to further improve the color richness of the generated results. Our extensive experiments demonstrate that the proposed DDColor significantly outperforms existing state-of-the-art works both quantitatively and qualitatively. Code will be made publicly available.
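
The abstract does not define the colorfulness loss, so the following is only a minimal sketch under stated assumptions. It uses the widely cited Hasler-Süsstrunk colorfulness measure, and the loss form `1 - colorfulness / scale` (with the hypothetical names `colorfulness_loss` and `scale`) is an illustrative guess, not necessarily the authors' formulation.

```python
import math

def colorfulness(pixels):
    """Hasler-Susstrunk colorfulness of a list of (R, G, B) tuples in [0, 255]."""
    # Opponent color channels: red-green and yellow-blue.
    rg = [r - g for r, g, b in pixels]
    yb = [0.5 * (r + g) - b for r, g, b in pixels]

    def mean(xs):
        return sum(xs) / len(xs)

    def std(xs):
        m = mean(xs)
        return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

    sigma = math.sqrt(std(rg) ** 2 + std(yb) ** 2)
    mu = math.sqrt(mean(rg) ** 2 + mean(yb) ** 2)
    return sigma + 0.3 * mu

def colorfulness_loss(pixels, scale=100.0):
    # Assumed loss form (hypothetical): smaller when the image is more colorful.
    return 1.0 - colorfulness(pixels) / scale
```

Under this sketch a pure gray image has zero colorfulness and therefore the highest loss, so minimizing the term pushes predictions away from desaturated outputs.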

Translated by Google Translate

Boosting Point Clouds Rendering via Radiance Mapping

Xiaoyang Huang , Yi Zhang , Bingbing Ni , Teng Li , Kai Chen , Wenjun Zhang

Category: Computer Vision

2022-10-27

Recent years have witnessed rapid development in NeRF-based image rendering due to its high quality. However, point cloud rendering remains comparatively less explored. Compared to NeRF-based rendering, which suffers from dense spatial sampling, point cloud rendering is naturally less computationally intensive, which enables its deployment on mobile computing devices. In this work, we focus on boosting the image quality of point cloud rendering with a compact model design. We first analyze the adaptation of the volume rendering formulation to point clouds. Based on the analysis, we simplify the NeRF representation to a spatial mapping function that requires only a single evaluation per pixel. Further, motivated by ray marching, we rectify the noisy raw point clouds to the estimated intersections between rays and surfaces, which serve as the queried coordinates; this avoids spatial frequency collapse and neighbor point disturbance. Composed of rasterization, spatial mapping and refinement stages, our method achieves state-of-the-art performance on point cloud rendering, outperforming prior works by notable margins with a smaller model size. We obtain a PSNR of 31.74 on NeRF-Synthetic, 25.88 on ScanNet and 30.81 on DTU. Code and data are publicly available at https://github.com/seanywang0408/RadianceMapping.
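
For context, the standard discrete volume rendering sum that NeRF-style methods evaluate along each ray can be sketched as follows. This is the generic front-to-back compositing rule, not the authors' simplified mapping; the function name `composite` is hypothetical.

```python
def composite(alphas, colors):
    """Front-to-back alpha compositing along one ray.

    alphas: per-sample opacities in [0, 1], ordered from near to far.
    colors: per-sample RGB tuples, same order.
    """
    transmittance = 1.0  # fraction of light not yet absorbed
    out = [0.0, 0.0, 0.0]
    for alpha, color in zip(alphas, colors):
        weight = transmittance * alpha
        for ch in range(3):
            out[ch] += weight * color[ch]
        transmittance *= 1.0 - alpha
    return tuple(out)
```

When the first sample on a ray is fully opaque, every later weight is zero and the sum collapses to one term, which gives some intuition for why a surface-like point cloud could be rendered with a single evaluation per pixel.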

Federated Meta-Learning for Traffic Steering in O-RAN

Hakan Erdol , Xiaoyang Wang , Peizheng Li , Jonathan D. Thomas , Robert Piechocki , George Oikonomou , Rui Inacio , Abdelrahim Ahmad , Keith Briggs , Shipra Kapoor

Category: Machine Learning

2022-09-13

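Compared to LTE networks, the vision of 5G lies in providing high data rates, low latency (to enable near-real-time applications), significantly increased base station capacity, and near-perfect quality of service (QoS) for users. To provide such services, 5G systems will support various combinations of access technologies such as LTE, NR, NR-U and Wi-Fi. Each radio access technology (RAT) provides a different type of access, and these should be allocated and managed optimally among users. Besides resource management, 5G systems will also support a dual connectivity service. The orchestration of the network therefore becomes a more difficult problem for system managers than it was with legacy access technologies. In this paper, we propose a RAT allocation algorithm based on federated meta-learning (FML), which enables the RAN intelligent controller (RIC) to adapt more quickly to dynamically changing environments. We design a simulation environment that contains LTE and 5G NR service technologies. In the simulation, our objective is to fulfil UE demands within the transmission deadline in order to provide higher QoS values. We compare the proposed algorithm with a single RL agent, the Reptile algorithm and a rule-based heuristic method. Simulation results show that the proposed FML method achieves higher caching rates in the first deployment round, by 21% and 12% respectively. Moreover, the proposed approach adapts to new tasks and environments the fastest among the compared methods.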
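
The Reptile baseline named in this abstract follows a simple meta-update rule, sketched below on a toy one-parameter problem. This is not the authors' FML algorithm; the quadratic task loss and all names (`sgd_on_task`, `reptile_round`, `meta_lr`) are assumptions made for illustration.

```python
def sgd_on_task(theta, target, lr=0.1, steps=5):
    """Run a few SGD steps on the toy loss 0.5 * (theta - target) ** 2."""
    for _ in range(steps):
        grad = theta - target
        theta -= lr * grad
    return theta

def reptile_round(theta, task_targets, meta_lr=0.5):
    """One meta-update: move theta toward the average task-adapted parameter."""
    adapted = [sgd_on_task(theta, t) for t in task_targets]
    mean_adapted = sum(adapted) / len(adapted)
    return theta + meta_lr * (mean_adapted - theta)
```

Averaging the task-adapted parameters before the meta-step resembles the aggregation step in federated averaging, which may help explain why meta-learning and federated learning combine naturally.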

Subtype-Former: a deep learning approach for cancer subtype discovery with multi-omics data

Hai Yang , Yuhang Sheng , Yi Jiang , Xiaoyang Fang , Dongdong Li , Jing Zhang , Zhe Wang

Category: Machine Learning

2022-07-28

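Motivation: Cancer is heterogeneous, which complicates precise approaches to personalized treatment. Accurate subtyping can lead to better survival rates for cancer patients. High-throughput technologies provide multiple omics data for cancer subtyping. However, precise cancer subtyping remains challenging because of the large volume and high dimensionality of omics data. Results: This study proposes Subtype-Former, a deep learning method based on MLP and Transformer blocks, to extract a low-dimensional representation of multi-omics data. K-means and consensus clustering are also used to obtain accurate subtyping results. We compare Subtype-Former with other state-of-the-art subtyping methods across 10 TCGA cancer types. We find that, based on survival analysis, Subtype-Former performs better on benchmark datasets of more than 5,000 tumors. In addition, Subtype-Former achieves outstanding results in pan-cancer subtyping, which can help analyze the commonalities and differences across cancer types at the molecular level. Finally, we apply Subtype-Former to the 10 TCGA cancer types and identify 50 essential biomarkers, which can be used to study targeted cancer drugs and to advance cancer treatment in the era of precision medicine.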

Variational Autoencoder Assisted Neural Network Likelihood RSRP Prediction Model

Peizheng Li , Xiaoyang Wang , Robert Piechocki , Shipra Kapoor , Angela Doufexi , Arjun Parekh

Category: Machine Learning

2022-06-27

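Measuring customer experience on mobile data is of utmost importance for global mobile operators. The reference signal received power (RSRP) is one of the important indicators for current mobile network management, evaluation and monitoring. Radio data gathered through the minimization of drive tests (MDT), a 3GPP standard technique, is commonly used for radio network analysis. Collecting MDT data in different geographical areas is inefficient and constrained by terrain conditions and user presence, so it is not an adequate technique for dynamic radio environments. In this paper, we study a generative model for RSRP prediction that exploits MDT data and a digital twin (DT), and we propose a data-driven, two-tier neural network (NN) model. In the first tier, environmental information related to user equipment (UE), base stations (BS) and network key performance indicators (KPIs) is extracted through a variational autoencoder (VAE). The second tier is designed as a likelihood model. Here, the environmental features and the real MDT data features are used together, formulating an integrated training process. In validation on real-world data, the proposed model is compared against an empirical model and a fully connected prediction network, and improves accuracy by about 20% or more over the empirical model.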
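
The two ingredients that any standard VAE relies on, the reparameterized sample and the KL term against a standard normal prior, can be sketched as follows. This is a generic illustration assuming a diagonal Gaussian posterior; it says nothing about the authors' actual network, and the function names are hypothetical.

```python
import math
import random

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps elementwise, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) in closed form."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))
```

The KL term is zero exactly when the posterior equals the prior, so it acts as a regularizer that keeps the extracted environment features in a well-behaved latent space.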

Sim2real for Reinforcement Learning Driven Next Generation Networks

Peizheng Li , Jonathan Thomas , Xiaoyang Wang , Hakan Erdol , Abdelrahim Ahmad , Rui Inacio , Shipra Kapoor , Arjun Parekh , Angela Doufexi , Arman Shojaeifard

Category: Machine Learning

2022-06-08

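The next generation of networks will actively embrace artificial intelligence (AI) and machine learning (ML) technologies for network automation and optimal network operation strategies. The emerging network structure represented by Open RAN (O-RAN) conforms to this trend, and the radio intelligent controller (RIC) at the centre of its specification serves as a host for ML applications. Various ML models, especially reinforcement learning (RL) models, are regarded as the key to solving RAN-related multi-objective optimization problems. However, it should be recognized that most current RL successes are confined to abstract and simplified simulation environments, which may not translate directly into high performance in complex real environments. One of the main reasons is the modelling gap between the simulation and the real environment, which can leave an RL agent trained in simulation ill-equipped for the real environment. This issue is known as the sim2real gap. This article brings the sim2real challenge to the fore in the context of O-RAN. Specifically, it emphasizes the characteristics and benefits of digital twins (DT) as a place for model development and verification. Several use cases are presented to exemplify and demonstrate the failure modes of simulation-trained RL models in real environments. The effectiveness of DT in assisting the development of RL algorithms is discussed. Then the current state-of-the-art learning-based methods commonly used to overcome the sim2real challenge are presented. Finally, the development and deployment concerns for realizing RL applications in O-RAN are discussed from the viewpoint of potential issues such as data interaction, environment bottlenecks and algorithm design.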

BiFSMN: Binary Neural Network for Keyword Spotting

Haotong Qin , Xudong Ma , Yifu Ding , Xiaoyang Li , Yang Zhang , Yao Tian , Zejun Ma , Jie Luo , Xianglong Liu

Category: Natural Language Processing

2022-02-14

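Deep neural networks, such as Deep-FSMN, have been widely studied for keyword spotting (KWS) applications. However, the computational resources available to these networks are significantly constrained, since they usually run on-call on edge devices. In this paper, we present BiFSMN, an accurate and extremely efficient binary neural network for KWS. We first construct a high-frequency enhancement distillation scheme for binarization-aware training, which emphasizes the high-frequency information in the full-precision network's representation that is more crucial for optimizing the binarized network. Then, to allow instant and adaptive accuracy-efficiency trade-offs at runtime, we also propose a thinnable binarization architecture that further liberates the acceleration potential of the binarized network from the topology perspective. Moreover, we implement a fast bitwise computation kernel for BiFSMN on ARMv8 devices, which fully utilizes registers and increases instruction throughput to push the limit of deployment efficiency. Extensive experiments show that BiFSMN outperforms existing binarization methods by convincing margins on various datasets and is even comparable with its full-precision counterpart (for example, a drop of less than 3% on Speech Commands V1-12). We highlight that, benefiting from the thinnable architecture and the optimized 1-bit implementation, BiFSMN achieves an impressive 22.3x speedup and 15.5x storage saving on real-world edge hardware.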
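
Binary networks gain much of their speed from replacing multiply-accumulate with bitwise operations. The sketch below shows the general identity behind such kernels for vectors with entries in {-1, +1}; it is not the paper's ARMv8 kernel, and the helper names `pack_signs` and `binary_dot` are hypothetical.

```python
def pack_signs(values):
    """Pack a list of +1/-1 values into an int; bit i is 1 when values[i] == +1."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits

def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1, +1} vectors from their packed bits.

    Matching positions contribute +1 and mismatching ones -1, so the
    result is n - 2 * (number of differing bits).
    """
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches
```

On real hardware the XOR and bit count each map to a single instruction over a whole register, which is where the throughput advantage of 1-bit inference comes from.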

RLOps: Development Life-cycle of Reinforcement Learning Aided Open RAN

Peizheng Li , Jonathan Thomas , Xiaoyang Wang , Ahmed Khalil , Abdelrahim Ahmad , Rui Inacio , Shipra Kapoor , Arjun Parekh , Angela Doufexi , Arman Shojaeifard

Category: Machine Learning

2021-11-12

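Radio access network (RAN) technologies continue to witness massive growth, with Open RAN gaining the most recent momentum. In the O-RAN specifications, the RAN intelligent controller (RIC) serves as an automation host. This article introduces principles of machine learning (ML), and in particular reinforcement learning (RL), that are relevant to the O-RAN stack. Furthermore, we review state-of-the-art research in wireless networks and map it onto the RAN framework and the hierarchy of the O-RAN architecture. We provide a taxonomy of the challenges faced by ML/RL models throughout the development life-cycle, from system specification to production deployment (data acquisition, model design, testing and management, etc.). To address these challenges, we integrate a set of existing MLOps principles with the unique characteristics that arise when RL agents are considered. This paper discusses a systematic life-cycle pipeline for model development, testing and validation, termed RLOps. We discuss all the fundamental parts of RLOps, which include model specification, development and distillation, production environment serving, operations monitoring, safety/security and the data engineering platform. Based on these principles, we propose best practices for achieving an automated and reproducible model development process.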

Understanding Imbalanced Semantic Segmentation Through Neural Collapse

Zhisheng Zhong , Jiequan Cui , Yibo Yang , Xiaoyang Wu , Xiaojuan Qi , Xiangyu Zhang , Jiaya Jia

Categories: Computer Vision | Machine Learning

2023-01-03

A recent study has shown a phenomenon called neural collapse in that the within-class means of features and the classifier weight vectors converge to the vertices of a simplex equiangular tight frame at the terminal phase of training for classification. In this paper, we explore the corresponding structures of the last-layer feature centers and classifiers in semantic segmentation. Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers. However, such a symmetric structure is beneficial to discrimination for the minor classes. To preserve these advantages, we introduce a regularizer on feature centers to encourage the network to learn features closer to the appealing structure in imbalanced semantic segmentation. Experimental results show that our method can bring significant improvements on both 2D and 3D semantic segmentation benchmarks. Moreover, our method ranks 1st and sets a new record (+6.8% mIoU) on the ScanNet200 test leaderboard. Code will be available at https://github.com/dvlab-research/Imbalanced-Learning.
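
The simplex equiangular tight frame that neural collapse converges to can be built and checked directly. The sketch below uses one standard construction, centering and rescaling the standard basis; the function names are hypothetical and the code is independent of the authors' regularizer.

```python
import math

def simplex_etf(k):
    """Return k unit vectors in R^k forming a simplex equiangular tight frame."""
    scale = math.sqrt(k / (k - 1))
    # Row i is scale * (e_i - (1/k) * ones): the centered, rescaled basis vector.
    return [[scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
            for i in range(k)]

def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))
```

Every pair of distinct vectors has inner product -1/(k-1), the most negative value that k unit vectors can share, which is the "maximally separated" structure the abstract refers to.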

Discrimination, calibration, and point estimate accuracy of GRU-D-Weibull architecture for real-time individualized endpoint prediction

Xiaoyang Ruan , Liwei Wang , Michelle Mai , Charat Thongprayoon , Wisit Cheungpasitporn , Hongfang Liu

Category: Machine Learning

2022-12-19

Real-time individual endpoint prediction has always been a challenging task, but one of great clinical utility for both patients and healthcare providers. With 6,879 chronic kidney disease stage 4 (CKD4) patients as a use case, we explored the feasibility and performance of gated recurrent units with decay that model the Weibull probability density function (GRU-D-Weibull) as a semi-parametric longitudinal model for real-time individual endpoint prediction. GRU-D-Weibull has a maximum C-index of 0.77 at 4.3 years of follow-up, compared to 0.68 achieved by competing models. The L1-loss of GRU-D-Weibull is ~66% of that of XGB(AFT), ~60% of that of MTLR, and ~30% of that of the AFT model at the CKD4 index date. The average absolute L1-loss of GRU-D-Weibull is around one year, with a minimum of 40% Parkes serious error after the index date. GRU-D-Weibull is not calibrated and significantly underestimates true survival probability. Feature importance tests indicate that blood pressure becomes increasingly important during follow-up, while eGFR and blood albumin are less important. Most continuous features have a non-linear/parabolic impact on predicted survival time, and the results are generally consistent with existing knowledge. As a semi-parametric temporal model, GRU-D-Weibull shows advantages in built-in parameterization of missingness, native support for asynchronously arriving measurements, the capability to output both probability and point estimates at arbitrary time points for arbitrary prediction horizons, and improved discrimination and point estimate accuracy after incorporating newly arrived data. Further research on its performance with more comprehensive input features and with in-process or post-process calibration is warranted to benefit CKD4 or similar terminally ill patients.
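
A model that outputs Weibull parameters can yield both a survival probability and a point estimate, as the abstract notes. The sketch below shows the standard two-parameter Weibull formulas; how the network produces `shape` and `scale`, and the use of the median as the point estimate, are assumptions for illustration only.

```python
import math

def weibull_pdf(t, shape, scale):
    """Density f(t) = (k / lam) * (t / lam)^(k - 1) * exp(-(t / lam)^k)."""
    if t <= 0:
        return 0.0
    z = t / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

def weibull_survival(t, shape, scale):
    """Probability of surviving beyond time t: S(t) = exp(-(t / lam)^k)."""
    if t <= 0:
        return 1.0
    return math.exp(-((t / scale) ** shape))

def weibull_median(shape, scale):
    """Time at which the survival probability drops to 0.5."""
    return scale * math.log(2.0) ** (1.0 / shape)
```

Because the survival curve is available in closed form for any horizon, calibration can be assessed by comparing predicted S(t) against observed survival fractions at chosen time points.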
